Tests of Statistical Significance and Background Estimation in 
Gamma Ray Air Shower Experiments. 
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ABSTRACT 

In this paper we discuss several methods of significance calculation and point out the limits 
of their applicability. We then introduce a self consistent scheme for source detection and discuss 
some of its properties. The method allows incorporating background anisotropies by vetoing 
existing small scale regions on the sky and compensating for known large scale anisotropies. By 
giving an example using Milagro gamma ray observatory we demonstrate how the method can 
be employed to relax the detector stability assumption. Two practical implementations of the 
method are discussed. The method is universal and can be used with any large field-of-view 
detector, where the object of investigation, steady or transient, point or extended, traverses its 
field of view. 



Subject headings: atmospheric effects- 
tical 

Introduction 



-methods: data analysis — methods: numerical — methods: statis- 



The problem of evaluation of statistical signif- 
icance of observations when searching for gamma 
ray sources using air shower experiments remains 
one of highest importance. The emission from a 
source would appear as an excess number of events 
coming from the directions of the candidate over 
the background level. The difficulty arises because 
the signal to background ratio as registered by the 
detectors in this energy range is often quite unfa- 
vorable, requiring careful examination of data. 



In this paper, we expand on the usually adopted 
procedure of the significance calculation described 
by Li and Ma (1983), in particular on the condi- 
tions of its applicability. The prescription relies on 
the knowledge of the expected background level, 
methods of estimation of which are reviewed in 
Alexandreas et al. (1993). However, the stan- 
dard significance calculation method is not com- 
patible with these methods of background estima- 
tion. In this paper we introduce a self consistent 
scheme for a source detection and discuss some of 
its properties. The method is applicable to point 
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and extended source searches as well as to searches 
for transient phenomena. We show how practical 
problems specific to an experiment can be incor- 
porated into the method. 

The methods described in this paper were 
developed for, and applied in two gamma ray 
searches (Fleysher, L. 2003; Fleysher, R. 2003) 
using the Milagro water Cherenkov air shower de- 
tector (Atkins ct al. 2000). 

2. Gamma ray astronomy using the air 
shower technique. 

A typical air shower detector registers particles 
from air showers that survive to the ground level. 
The recorded information is used to provide the 
direction of the incident primary particle and per- 
haps provide some information on its energy and 
type. Among the particles entering the Earth's 
atmosphere gamma rays present a very small frac- 
tion, often less that 10~ 3 . Most of the air show- 
ers are induced by charged cosmic rays that form 
a background to the search for gamma initiated 
showers from a source. Special techniques and 
algorithms have been developed to suppress this 
background in order to increase the sensitivity to 
gamma primaries. These, however, are limited due 
to similarities of the cascades produced by pri- 
maries of both types. The application of these 
techniques helps but does not solve the problem 
of gamma ray source detection in the presence of 
a background. Therefore, one of the problems in 
gamma ray astronomy using air shower technique 
is to be able to determine the level of background 
(Alcxandreas et al. 1993). This problem is rather 
difficult if one tries to calculate it from the first 
principles, because it would require exact knowl- 
edge of the details of the detector operation, its 
sensitivity which may depend on voltages, tem- 
perature, properties of atmosphere and direction 
reconstruction algorithms. The problem is solved 
by measuring the background level using the same 
instrument. 

Thus, in a typical experiment, two measure- 
ments are performed — one corresponding to the 
observation of the candidate (so called on-source) 
and the other is the measurement of the corre- 
sponding background level (so called off-source ob- 
servation). Then, a decision is made as to the 
plausibility of the existence of the source. Because 



the results of the on- and off-source observations 
represent random numbers drawn from their re- 
spective parent distributions, the question of the 
existence of the source is the question of whether 
the numbers are drawn from the same or different 
parent distributions. It is addressed by a hypoth- 
esis test. 

3. Li Ma statistic. 

Many statistical tests have been used to test 
the null hypothesis of the absence of a source given 
two independent counts N\ from the on-source and 
N 2 from the off-source regions accumulated during 
time periods t\ and t 2 respectively with all other 
conditions being equal. An improvement was pro- 
posed by Li and Ma (1983) and is based on the 
test statistic: 



U = 



iVi - aN 2 
^a(N 1 +N 2 ) 



a = t 1 /t 2 >0 (1) 



Because each event carries no information 
about another, each of the observed counts can 
be regarded as being drawn from a Poisson distri- 
bution with some value of the parameter (adjusted 
for the duration of observation). The motivation 
provided by the authors is that the numerator 
may be interpreted as the excess number of events 
from source over the expected background and 
denominator is the maximum likelihood estimate 
of the standard deviation of the numerator given 
that the null hypothesis is true. 

It has been argued by the authors (Li and Ma 
1983) that for the case of large N 1<2 {N 1>2 > 10), 
and if the null hypothesis is true, the distribu- 
tion of U becomes Gaussian with zero mean and 
unit variance. This statement is based on the well 
known fact that Poisson distribution approaches 
that of Gaussian for large values of parameter \x: 
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-(n-liY/2^ 



For a measured value u of U, the calculation 
of the p- value (which we denote by £) becomes 
simple: 
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when looking for a source and 
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when looking for a sink. 

The null hypothesis is rejected with significance 
£ c if £ < £ c . The significance £ c is set in advance, 
before the test is performed and its choice is based 
on the penalty for rejecting the null hypothesis 
when it is true. (Scientific false discoveries should 
not happen very often, and thus the significance 
is usually selected as £ c = 10~ 3 .) Because of the 
one-to-one correspondence between £ and u, the 
significance of a measurement can be quoted in 
the units of U. 

3.1. Conditions of applicability of Li Ma 
statistic. 

When the null hypothesis is true, the distri- 
bution of statistic U (equation 1) approaches the 
normal distribution in the limit of large num- 
bers. Indeed, substituting the factorial in the Pois- 
son distribution using the Stirling formula n\ w 
\J2-Kn n n e~ n , one obtains: 



or 



In (p^(n)V2mj wn^l-ln 



Expanding the right hand side into Taylor series 
in n in the vicinity of fM, denoting S = (n — fi) and 
keeping first two non-zero terms, we obtain: 
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e -5 2 /2M. e <5 3 /6p 2 



Thus, it is seen that the Poisson distribution ap- 
proaches that of Gauss in a narrow region around 
its mean: |<5 3 /6/i 2 | = I (n — /i) 3 /6^ 2 | << 1 with 
(i > 6. Substituting n with Ni t 2 and n with corre- 
sponding estimates of [i = (N\ + iV 2 )ii, 2/(^1 + h) 
we obtain the region around zero |u| < \ui,\ where 
the distribution of statistic U is approximately 
normal. That is, for 



\u\ < \u b \ « 




(a) Values of Ni and N2 are drawn from Poisson distri- 
bution with averages equal to 500 and 5000 respectively 
(a = 0.1). According to equation (2), Uf, <C 5.3. 




(b) Values of JVi and N2 are drawn from Poisson distri- 
bution with averages equal to 15- 10 5 and 15- 10 6 respec- 
tively (a = 0.1). According to equation (2), u b <C 20.3. 



Fig. 1.— Distributions of statistic U (equation 1) 
obtained in two runs of Monte Carlo simulations. 
Parameters of the best fit to a Gaussian distribu- 
tion are listed in the box. The two curves should 
agree in the u^-neighborhood around zero. The 
number of entries in each run (about 10 9 ) was cho- 
sen to provide reasonable accuracy in the region 
plotted. 




(a) Values of Ni and N2 are drawn from Poisson distri- 
bution with averages equal to 500 and 5000 respectively 
(a = 0.1). According to equation (2), u' b -C 5.3. 




(b) Values of Ni and N2 are drawn from Poisson distri- 
bution with averages equal to 15- 10 5 and 15 -10 6 respec- 
tively (a = 0.1). According to equation (2), u' b <C 20.3. 



Fig. 2. — Distributions of statistic U' (equation 3) 
obtained in two runs of Monte Carlo simulations 
similar to that presented in figure 1. 



min (^/36a(l + a) 2 (JVi +JV 2 ), (2) 
i/36a-Hl + ap(N 1 +N 2 )^ 

the error on the p-value £ due to this approx- 
imation does not exceed 1/\/2tt f + °° e~ x2 / 2 dx. 
Figure 1 shows the results of a Monte Carlo simu- 
lations for distribution of the statistic U. It can be 
seen that the distribution is approximately normal 
in the vicinity of zero. 

By the same arguments it may be shown that 
within essentially the same region around zero, an- 
other statistic U' 



U' 



:N 2 



(3) 



s/N[ + a 2 N 2 

is also distributed normally. The motivation for 
statistic U' is similar to that of statistic U (equa- 
tion 1) that the numerator may be interpreted as 
the excess number of events from source over the 
expected background but the denominator is the 
maximum likelihood estimate of the standard de- 
viation of the numerator given that the alternative 
hypothesis is true. (The alternative hypothesis in 
this case is that both observations N\ and iV 2 are 
from Poisson distributions with unrelated means.) 
Although this motivation appears to be incorrect 
and the statistic was abandoned by Li and Ma 
(1983), the critical range of U' may be defined as 
u' > u' for testing the null hypothesis against the 
presence of a source and u' < u' for testing against 
the presence of a sink. Because under the condi- 
tions of applicability of Li Ma statistic (equation 
2) statistic U' is distributed normally, the p-valuc 
calculation is identical to that of for statistic U. 
The figure 2 presents the results of Monte Carlo 
simulations of distribution of statistic U'. 

In general, a hypothesis test may be based on 
any statistic if its distribution under the null hy- 
pothesis in known. 

It is interesting to note that equation (2) can be 
used to aid in the design of an experiment. Indeed, 
if the relative on- to off-source region exposure can 
be estimated before the experiment is performed, 
then equation (2) allows estimating the observa- 
tion time needed to collect enough events to reach 
the accuracy of statistics U or U' compatible with 
the desired significance £ c . For example, if a s=s 0.1 
and the significance in units of U is set at 3.0, then 
the experiment (duration of observation) has to be 
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designed in such a way as to allow accumulation 
of at least 10 5 events from the on-source region. 
If the significance is set at 5.0, then number of 
on-source events should be at least 10 6 . 

4. Background estimation. 

In order to be able to implement any of the 
above hypothesis tests, one must assure that the 
two measurements N\ and N2 are independent and 
that the ratio of observation times a is available 
while other conditions are equal. Indeed, exam- 
ining a typical scenario of a gamma ray experi- 
ment it is seen that on- and off-source observa- 
tions can be performed at the same time utiliz- 
ing the wide field of view of the detector, or they 
can be performed at different times making mea- 
surement in the same local directions of the field 
of view. (Due to the Earth's rotation, the off- 
source bin may present itself in the directions of 
local coordinates previously pointed at the source 
bin.) Both of these stipulations could contradict 
to the conditions of "being equal" : if observations 
are done at the same time, then non-uniformity 
in the acceptance of the array to air showers due 
to detector geometry must be compensated for; 
if observations are done at different times, then 
any time variation in detector operation must be 
addressed. Under these varying conditions, the 
meaning of the parameter a must be changed to 
the effective ratio of exposures of the bins. The 
mechanism of such an equalization and a deter- 
mination is called background estimation. The 
name is due to interpretation of the second term 
of the numerator of equation (1) as the expected 
number of background events in the source region: 
Nb = aN 2 . Correspondingly, the number of events 
Ni obtained from the direct source observation 
will be denoted N s = N\. Below we consider two 
methods of background estimation: direct integra- 
tion and time swapping. 

4.1. Isotropy and stability assumptions. 

A widely accepted method of background esti- 
mation (Alexandreas et al. 1993) recognizes that 
usually no major changes in the detector config- 
uration are made on short time scales and takes 
advantage of the rotation of the detector with the 
Earth which sweeps the sky across the detector's 
field of view. It also recognizes that most air show- 



ers detected are produced by charged cosmic rays. 
Because of their charge and because of the pres- 
ence of random magnetic fields in the interstellar 
medium, the cosmic ray particles lose all memory 
of their initial directions and sites of production, 
and can be regarded as forming isotropic radia- 
tion. Detector configuration stability implies that 
the acceptance of the detector is time independent 
although variations in the overall rate of detected 
events are allowed. (An example of such rate vari- 
ations could be an event rate decrease caused by 
a temporary data acquisition system overload.) 
Therefore, the average number of detected events 
as a function of local coordinates x and time t on 
the short time scale can be written in the form: 

dN (x, t) = G(x) ■ R(t) dxdt (4) 

Here R(t) is overall event rate, G(x) — accep- 
tance of the array such that f, . ,, , . G(x) dx = 

J J field of view v ' 

1. The local coordinates x could be either hour 
angle and declination or zenith and azimuth. The 
average number of background events expected in 
the source bin, is then given by 



N b = J J (1 - </>{x,t)) G(x) R(t) dxdt (5) 

where (j)( x i t) is equal to zero if x and t are such 
that they translate into inside of the source bin, 
and is one otherwise. The isotropy and stability 
assumptions (equation 4) become part of the null 
hypothesis being tested. 

4.2. Direct integration method. 

The direct integration method of source detec- 
tion is based on isotropy and stability assumptions 
(equation 4) and is the method where the integra- 
tion of equation (5) is performed numerically by 
discretizing both G(x) and R(t) on a fine grid and 
replacing integrals by sums. The significance test 
is based on either statistic (1) or (3). The ac- 
ceptance and the event rate are estimated by his- 
togramming local coordinates x and event times 
t of the events collected during integration time 
period from the entire sky. The fluctuations in Nb 
are dominated by the ones in G(x) because the 
event rate R(t) is collected from the entire sky and 
may be deemed as known to high precision. In this 
scheme, the source region defined by <f)(x, t) also 
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gets discretized, therefore, source count N s must 
be obtained using the same discretized definition 
of the source region. Extending the time integra- 
tion window is equivalent to increasing exposure to 
the off-source bin, which leads to decreasing value 
of a and improved sensitivity. The assumption 
(4), however, must hold during the entire integra- 
tion period placing a constraint on the maximum 
size of the off-source bin. The time integration 
window is limited by 24 hours of sidereal day. 

4-2.1. Eliminating on-source events from back- 
ground estimation. 

The realization of the direct integration method 
just described includes on-source events N s in 
the calculation of expected background Nb (via 
G(x) and R(t)). This, however, is inconsistent 
with their independence required by Li Ma statis- 
tic (equation 1) and was already recognized in 
(Alexandreas et al. 1993). An extreme example 
would be the case of sighting of the North Celestial 
Pole. There, the source does not present any ap- 
parent motion in local coordinates because it lies 
on the axis of rotation of the Earth. The off-source 
bin does not exist, the off-source count N 2 and the 
ratio of exposures a are not defined and the mea- 
surement can not be performed using isotropy and 
detector stability assumptions (equation 4) . In the 
framework of the direct integration method, how- 
ever, the background Nb is guaranteed to be esti- 
mated exactly equal to N s and therefore U = 0. 
This is clearly unsatisfactory. 

In order to be able to use either of the statistics 
(1) or (3) the events from the source bin should be 
excluded from the background estimation. How- 
ever, simply removing all of these events from the 
procedure will destroy its foundation that the lists 
of local coordinates and times represent samples 
from G(x) and R(t) respectively. A solution to 
this problem follows. 

Denote by tjj(x,t) a function similar to (f>(x,t) 
which defines the region of the sky events from 
which are to be excluded from the background 
estimation. The excluded region should contain 
the candidate source bin, but is not limited to it. 
Also denote by N out (x,t) the number of detected 
events originating from outside of the excluded re- 
gion, and R ou t(t) their total event rate, then it is 
readily seen that 




® Declination Hour Angle 



Fig. 3. — To the formulation of background equa- 
tions (6). 



dN out (x,t) = ip(x,t)G(x)R(t) dxdt 

Integrating this equation with respect to t and 
x a system of equations on unknown G(x) and 
R(t) is obtained (N out (x) and R ou t(t) are available 
experimentally) : 

/ N out (x) = G(x) J ip(x,t') ■ R(t') dt' 

{ R out {t) = R{t) ] 4,{x' ,t) ■ G{x') dx' {0) 

The numerical solution of these integral equa- 
tions provides R(t) and G(x) based on data 
Nout(x) and R ut(t) from the outside of the ex- 
cluded region to be used in equation (5). The situ- 
ation is illustrated on figure 3. The heavily shaded 
area is the outside of the excluded region bounded 
by il)(x, t) in its discrete form, events from which 
may be used for the off-source observation. The 
region of interest, the on-source region, is defined 
by some other conditions (f>(x, t) which are irrele- 
vant for the background equations (6) as long as 
it is contained in the excluded region. 

It can be noted that both G(x) and R(t) en- 
ter into equations (6) and (5) only as a product 
G(x) ■ R{t), therefore, normalization of either of 
them does not make any difference as long as the 
product is preserved. Also, if there are points 
{xq} in the local coordinates which are always in- 
side the excluded region, which may happen if the 
detector was operational during a short time pe- 
riod and/or the excluded region was large, (that is 
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iP({x },t) = 0, Vi) then N out ({x }) = and the 
first equation of (6) becomes: 

= G({x }) • 

leading to G({xo}) and integral in equation (5) 
being undefined. On-source events with local co- 
ordinates from these regions must be discarded as 
having no corresponding background estimate. It 
is thus seen that the method entails that the sec- 
ond, off-source region is defined by the regions of 
the local sky which have the opportunity to present 
themselves into the directions of the source region 
due to the Earth's rotation during the time pe- 
riod of integration. Different parts of the source 
region have different corresponding off-source re- 
gions. This leads to the ratio of exposures of on- 
and off-source regions a that is dependent on the 
local coordinates x: 

N b (x) _ f(l-<l>(x,t))R(t)dt 
X) N out {x) J ip(x,t)R(t)dt 

The off-source region corresponding to a given 
on-source region <f>(x, t) is not a celestial bin, it is 
a set of local directions with a(x) > 0. Because 
the measurements made from different local direc- 
tions x are independent, all measurements can be 
combined to obtain the compound statistic U: 

or 

jj _ N s - N b 

The described method is the integration scheme 
which is based on the direct integration method 
and which properly estimates the ratio of expo- 
sures a(x) and accounts for the source events. 

The importance of the source region exclusion 
is illustrated on figure 4 where results of the com- 
puter simulations for a Galactic plane observa- 
tion is presented. Detection of an extended source 
such as Galactic plane presents a difficult exam- 
ple because the ratio of on- and off-source expo- 
sures a(x) varies dramatically over the area of 
the source. The figure shows the excess number 
of events (N s — N b ) extracted from a simulated 



galactic signal (IS i\ function of Galactic latitude. 
The excess is recovered correctly by the modified 
method (equations 5 and 6) proposed here (figure 
4(b)) compared to the standard direct integration 
method (equation 5) (figure 4(a)). Use of the stan- 
dard method would lead to a 25% loss in both the 
excess number of events and in value of statistic 
U which are recovered by the modification. 

4.3. Time swapping method. 

In the time swapping method of source detec- 
tion the integration of equation (5) is performed 
by means of Monte Carlo, which leads to: 

M N 1 N 

N * = ~i E(! - ^D 1 - **)) 
i=i i=i 

(7) 

where N generated events (x i7 ti) are dis- 
tributed according to joint probability density 
G(x)R(t) with R(t) = R(t)/N , N being the total 
number of events detected during integration time 
window. A list of all coordinates of the detected 
events is regarded as a sample from the G(x) dis- 
tribution, while a similar list of all times as the 
one from R(t). Therefore, sample from G(x)R(t) 
distribution can be generated from the data by 
randomly associating an event's local coordinate 
x with another event's time t among the pool of 
detected events. The so created coordinate-time 
pair is called a generated event. The accuracy of 
Monte Carlo integration increases with the num- 
ber N of generated events. In the time swapping 
method, the function <f)(x, t) defining the source 
region does not have to be discretized as it had to 
be in the direct integration method. 

Here, acceptance G(x) and event rate R(t) must 
be solutions of the equations (6) to account for on- 
source events. In practice, Monte Carlo integra- 
tion is performed by substituting each real event's 
arrival time by a new time from the list of reg- 
istered times of collected events in a finite time 
window. This is why the method is referred to 
as time swapping method. The swapping is re- 
peated (3 times per each real event, (3 typically 
being around 10. The event rate R(t) is consid- 
ered to be constant on the very short time scale 
and therefore is saved as a histogram. Gener- 
ated event times are drawn from it. The sam- 
ple from G(x) is generated using events from the 
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(a) Excess number of events (N s — Nf,) as a function of 
Galactic Latitude. Source region is not excluded. 
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(b) Excess number of events (N s — Ni,) as a function of 
Galactic Latitude. The region of ±7° around Galactic 
equator is excluded from background estimation. 



outside of the excluded region and should contain 
N(x) — G(x) J R(t)dt events with given local co- 
ordinates x. However, the number of events avail- 
able is N out {x) = G{x) J ip(x,t)R(t)dt. Therefore, 
instead of swapping each event (3 times, missing 
events are created by choosing actual number of 
swaps from a Poisson distribution with parameter 
(1 + a'(x))(3 where 



l + a'(x) 



G{x) J R(t)dt _ G{x) J R(t)dt 



G{x) J i){x,t)R{t)dt N out (x) 



The significance calculation has to reflect the 
fact that the time swapping method is a Monte 
Carlo integration and thus introduces additional 
fluctuations in the estimate of Nb. The integration 
error reduces as the number of generated events 
increases or equivalently as /3 increases and the 
fluctuations in iVj, approach that of the direct in- 
tegration method. Use of statistic U' (equation 3) 
provides a transparent way of including these ad- 
ditional fluctuations. It can be shown (Fleysher, 
R. 2003) that the statistic V within framework 
of time swapping must be substituted by 



U'(x) 



N s — N b 



^V s +^a(x)iV b (x)+jV/3' 



(8) 



a(x) 



N b (x) 
N out {x) 



The fact that the source region defined by 
4>{x, t) does not have to be discretized is the ad- 
vantage of the time swapping method. Otherwise, 
it is based on the same assumptions as the direct 
integration method: stability of the detector and 
isotropy of the background (equation 4). 

4.4. Known anisotropies. 

It was assumed in the above discussion that no 
anisotropy on the sky is present. This, together 
with the stability assumption had lead to the equa- 
tion (4). In fact, if there are known sources on the 
sky, then the number of registered events is given 
by: 



Fig. 4.— Plots showing the results of Monte Carlo 
simulations with uniform Galactic signal flux be- 
ing 0.0088 that of background in the region of ±5° 
around the Galactic equator. The expected Galac- 
tic bin content is about 205000. 



dN(x, t) = (1 + S(x, t)) ■ G(x) ■ R(t) dxdt (9) 

where S(x, t) describes the strength of the 
sources as function of local coordinates and time. 
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For example, the Crab nebula is known to emit 
gamma rays in the TeV energy region (Weeks et al. 
1989; Atkins et al. 2003). Because the anisotropy 
function S(x, t) is not known, the region around 
the Crab has to be excluded from the background 
estimation even if the nebula is not the subject of 
investigation. 

Another, more dramatic example is given by 
two known cosmic-ray sinks on the sky: the Sun 
and the Moon (Samuelson 2001). Not only do 
they present a source of anisotropy, they also tra- 
verse the sky, blocking on their way potential can- 
didates and perturbing on-source count N s as well 
as Nb. This can be handled by vetoing certain 
size regions around the objects, that is treating 
them as part of the excluded region during integra- 
tion (equations 5 and 6) and disregarding events 
if they fall within the veto region when counting 
on-source events N s . In other words, if v{x,t) 
is the function describing the veto region where 
it is equal to zero and equal to one everywhere 
else, then excluded region ij){x,t) and source re- 
gion (j)(x,t) have to be redefined as: 

f V(M) := i>(x,t) ■ u(x,t) 

\ cf>(x,t) := 1 - (1 - <f>(x,t)) ■ v(x,t) 

In general, existing small scale anisotropies can 
be excluded or vetoed as described, known large 
scale ones have to be incorporated into the stabil- 
ity assumption. These will become a part of the 
null hypothesis being tested. 

Incorporation of the improved stability assump- 
tion (9) into the framework of direct integration 
method is straightforward: 

N b = j J(l-<f>(x,t))(l + S(x,t))G{x)R(t) dxdt 

The anisotropy function S(x,t) must be dis- 
cretized on the same grid as G(x) and R{t) are. 
In order to incorporate the improved stability as- 
sumption into the framework of time swapping 
method, the generated events {xi,ti) must repre- 
sent a sample from (1+S(x, t))G(x)R(t) which can 
be achieved with the help of the rejection method. 

4.5. Detector stability assumption re- 
examined. 

Despite the fact that no reconfigurations to the 
detector on the short time scale are made, the 



acceptance of the array G(x) depends on trans- 
mission properties of the atmosphere which may 
vary during the integration time window (equa- 
tion 5). In this case the stability assumption (4) 
is violated and must be replaced by assumption 
(9), where S(x,t) describes the atmospheric vari- 
ations. Thus, atmosphere must be considered as 
an integral part of the detector and we refer to 
the phenomenon in general as detector instability. 
If the variations are known, they can be incor- 
porated into background estimation as described 
above. The remainder of this section presents one 
method of determining such variations and shows 
how they are incorporated into S(x,t). 

The test of the stability assumption would be a 
comparison of two acceptances Gi(x) and Gj{x), 
i 7^ j measured at different times t, and tj. On 
physical grounds, a detector usually possesses a 
certain degree of azimuthal symmetry, so does the 
atmosphere, therefore acceptance is considered as 
function of zenith and azimuth angles G(z,A). 
The histograms Gi(z, A) and Gj(z, A) can be col- 
lected from the data for a certain duration of time 
(for example 30 minutes) around U and tj . For the 
purpose of background estimation the time scale 
during which the stability assumption (4) holds 
must be ascertained. Therefore, the test is aimed 
at studying the difference between the distribu- 
tions as function of time separation At = ti — tj . 

It has to be recognized that presence of sources 
or large scale anisotropies on the sky and insta- 
bility of the detector mimic each other, therefore, 
zenith and azimuth angle distributions alone are 
compared instead of two dimensional G(x)'s. The 
test can be implemented as a series of x 2 tests 
of Gi(x) and Gj(x) (yielding x 2 (ti,tj)) and then 
obtaining the combined Xj ota; (At) for time sepa- 
ration At = ti — tj : 

X 2 total (At)= X 2 (U,t,) 

The test statistic Xj ota; (At) so obtained follows 
&X 2 distribution with m to tai = YsAt=n-t - m (ti,tj) 
degrees of freedom if observed differences are of 
random nature only. Here m(ti,tj) are the num- 
ber of degrees of freedom in the corresponding 
X 2 (ti,tj) tests. The average of Xtotai i s equal to 
m to tai while its variance is equal to 2m to tai- Thus, 
the Xtotai P er degree of freedom should fluctuate 
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Fig. 5. — Results of stability assumption test 
with regard to zenith coordinate using Milagro 
data. Horizontal axis is time separation At with 
30 minute bins, vertical axis is corresponding 
Xtotai (A*) / Tfhotal (At) . Solid horizontal line is the 
expected value of one if the stability assumption 
holds. 

around 1.0. Examining the dependence of Xtotai 
on time separation At it is possible to test the 
detector stability assumption and to ascertain the 
proper integration time window. If detector insta- 
bility is recognized, care must be taken to improve 
the stability assumption. 

4-5.1. Illustration of diurnal modulations using 
Milagro. 

Figure 5 is an example of the results of the de- 
tector stability assumption test using Milagro data 
with regard to zenith coordinate. (For description 
of Milagro please see Atkins et al. (2000).) It is 
seen from the plots that the degree of violation of 
the assumption grows with time separation At as 
might be expected, but then it drops before grow- 
ing again. This can be interpreted as presence of 
a periodic component which insured that two ac- 
ceptances Gi(z) and Gj(z) separated by 24 UT 
hours are "closer" to each other than, say, those 
separated by only 12. Thus, despite the fact that 
no human intervention on the short time scale is 
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Fig. 6. — An example of the zenith correction 
function A(z) derived from Milagro data. 

made, the acceptance of the detector changes. 

Because the diurnal periodicity is noted, the in- 
vestigation of the modulation can be performed by 
comparing a particular distribution with its daily 
average. It was observed that the shape of the 
modulation (A(z)) of the zenith distribution is ap- 
proximately constant with amplitude varying from 
half hour to half hour. Therefore, the improved 
stability assumption is chosen to be of the form: 

dN(x, t) = (1 + 6{t)A{z)G{x)R{t) dxdt (10) 

where 9(t) is the amplitude of the correction 
at time t, A(z) is the polynomial zenith angle 
correction function coefficients of which are ob- 
tained from the modulation shape study, G(x) is 
the average acceptance of the detector obtained 
from equations (6). The example of the correction 
function is shown on figure 6. The example of the 
average daily amplitude dependence is shown on 
figure 7. The value of the amplitude is typically 
within ±4 • 10~ 5 range. The plot can also be used 
to justify the choice of half hour intervals for the 
amplitude measurement. The assumption (10) be- 
comes part of the null hypothesis. 
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Fig. 7. — An example of the average daily depen- 
dence of the zenith correction amplitude 9 derived 
from Milagro data. 

5. Conclusions. 

We have considered a typical air shower experi- 
ment conducted by means of two observations and 
have discussed two commonly used tests (based on 
statistics U and U') and conditions of their ap- 
plicability. A careful look at the situation where 
an astrophysical object traverses the large field 
of view of a detector had led us into the subject 
of background estimation. We have developed a 
method of background estimation which is consis- 
tent with the use of either statistics U or U' and 
have discussed two implementations of it: direct 
integration and time swapping. The background 
estimation method is based on widely adopted as- 
sumptions of short time scale stability of the de- 
tector operation and that of isotropy of cosmic 
ray background. We have discussed a way to re- 
lax the short time scale stability assumption and 
used Milagro data to illustrate the situation where 
presence of zenith diurnal modulations can easily 
be incorporated into the background estimation 
method. More generally, this is also the way to 
incorporate known large scale anisotropies. Small 
scale anisotropies do not have to be known, exist- 
ing ones can be handled by excluding or vetoing 
the regions around them. Any method based on 



the assumption of short time scale stability of the 
detector operation and that of isotropy of cosmic 
ray background can not be used for detection of 
stationary in the field of view objects. 

While the methods and ideas presented in this 
paper were developed for a gamma-ray air shower 
array, we believe that the methods can also find 
their applications outside of the field of gamma 
ray astronomy. The properties of the significance 
test can be useful for any counting type experi- 
ment in which number of events follows Poisson 
distribution, the background estimation method 
can be used with any large field of view detector, 
where the object of investigation traverses the field 
of view, such as in solar neutrino monitors or is 
transient such as in SuperNovae neutrino observa- 
tories. 
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